Method and Apparatus for Content Processing Application Acceleration

ABSTRACT

A network architecture enables the data flow of packets to be dynamically modified depending upon the operational needs of the packet being processed. This allows for separate processing of control and data path operations, and provides a mechanism for functions that require high computational support, such as encryption functions, to be offloaded onto processing devices that can support such functions. Other, less computationally intensive or lower priority functions can be forwarded to processing engines (PEs) having lower operation capacity. With such an arrangement, a dynamic mechanism is provided for customizing the data path of a network device to accommodate the computation needs of any application.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of U.S. patent application Ser. No. 10/692,842, filed Oct. 24, 2003, which claims priority to U.S. patent application 60/421,009 filed Oct. 24, 2002, entitled “Content Processing Application Acceleration” by Subramanian.

FIELD OF THE INVENTION

This invention relates generally to the field of communication and more specifically to a method and apparatus for communications content processing.

BACKGROUND OF THE INVENTION

Content Delivery Services (CDS) are applications that enable or enhance the quality of experience of the end-to-end transfer of data over a network. For example, Content Delivery Services include Security Processing, Virus Scanning, Policy Enforcers, Load Balancing, Network Address Translation processing and Streaming Content Caches. In the past the content delivery services have been layered on top of the services provided on an end system. However, executing the CDS on the end system adversely impacts the ability of the end system to perform its intended function, which reduces the performance and increases the cost of the end systems themselves. To overcome these problems, the provision of Content Delivery Services has migrated from the end user systems to the network edge.

However, network edge architecture is designed to optimize the performance potential of the communication medium, and therefore is not always the best architecture for providing high quality Content Delivery Services. Typically Content Delivery Services are provided at the edge by appliances that are either pure software or hardware assisted software processes. While software processing enables a larger number of individual appliance offerings to be made to the user, it does not provide the optimum per appliance performance, and is also inefficient in terms of power, area and cost. Hardware assisted architectures typically include one or more customized ASICs coupled to the network device via an external bus architecture, such as the Peripheral Component Interconnect (PCI) bus. The customized ASICs are designed to provide optimal per appliance performance. However, if multiple appliances are provided in the network offering, bottlenecks quickly arise at the PCI interface. In addition, the hardware assisted architectures are expensive, and inefficient in terms of power, area and cost.

It would be desirable to identify a network architecture which would support the offering of various Content Delivery Services appliances while enabling desired power, area, cost and bandwidth constraints to be met.

SUMMARY OF THE INVENTION

According to one aspect of the invention, a network device includes a network interface for transferring a packet between the network device and a network, at least two processing engines, and a frame steering processor, disposed between the network interface and the at least two processing engines. The frame steering processor is programmed to forward a packet received at the frame steering processor to either the network interface card or one of the at least two processing engines responsive to a treatment characteristic of the packet.

According to another aspect of the invention, a method of processing packets received at a network device includes the steps of receiving a packet at a network interface of the network device and forwarding the packet to a frame steering processor, coupled to the network interface. The frame steering processor acts, responsive to a treatment characteristic of the packet, to forward the packet to either one of at least two processing engines or the network interface.

According to still a further aspect of the invention, a method of architecting a network device includes the steps of apportioning a plurality of functions to be performed by the network device into a plurality of groupings based on the relative complexity of the functions. At least one of the functions is a frame steering function. The method includes the steps of selecting processing devices for performing the functionality of each grouping of the plurality of groupings, the processing devices being selected based on predetermined design criteria. The selected processing devices are coupled with a frame steering processing device associated with the frame steering function, wherein the frame steering processing device includes switch functionality for forwarding a packet received at an input of the frame steering device to one of the selected processing devices in response to a tag value of the packet. The method further includes the step of programming each of the processing devices to control the flow of packets through the frame steering processor by updating the tag value of the packet when it is received at the respective processing device. With such an arrangement, the network device can be programmed to ensure that packets are forwarded to the appropriate type of processing device depending upon the operating needs of the packet. Network specific packets can be forwarded to network specialized processing engines, control type functions can be forwarded to high performance processing devices, and data path type functions can be performed within the frame steering processor itself, or at a dedicated frame steering processor. Because each CDS has different needs, the flow through the network device can be customized to the exact need of the application. Thus, this flexible, dynamic architecture allows Content Delivery Services to be provided in a network where the high performance capacities are realized.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a network architecture for optimizing performance of content delivery services;

FIGS. 2A and 2B illustrate prior art Switch and Appliance architectures;

FIG. 3 illustrates one embodiment of a high performance switch/appliance architecture incorporating features of the present invention described with reference to FIG. 1;

FIG. 4 is a block diagram illustrating one embodiment of a unified switch/appliance network architecture of the present invention;

FIG. 5 is a diagram illustrating one embodiment of the unified switch/appliance architecture of FIG. 4;

FIGS. 6A and 6B are functional flow diagrams and block diagrams, respectively, of the flow of a packet through a network device architected according to the present invention;

FIGS. 7A and 7B are diagrams illustrating the separate processing paths that packets in a common flow may take for dedicated control or data handling according to the present invention; and

FIG. 8 is a block diagram illustrating exemplary components that may be included in a frame steering processor included in a network device architected according to the present invention.

DETAILED DESCRIPTION

Referring now to FIG. 1, a modular Content Delivery Services (CDS) network gateway architecture 10 is shown. The modular architecture 10 includes a number of Data Processor Engines (DPEs) 12 and a number of Control Processing Engines (CPEs) 14 coupled via interconnects 15 and 16, respectively, to a high performance interconnect 18. The high performance interconnect is coupled to one or more network interfaces 20 and 22, enabling communication between the CPEs and DPEs and a network.

According to the present invention, the network gateway architecture 10 may be designed to include DPEs and CPEs having a range of processing engine capabilities. Thus, DPE 12a may provide more capabilities or perform at a higher speed than DPE 12b. In addition, the interconnects that couple the various DPEs and CPEs to the high performance interconnect may have different bandwidth capabilities, or different physical designs. In one embodiment, the range of processing capabilities and interconnect types is selected to provide features that would support a variety of appliances. Selection of the particular DPE or CPE component can be streamlined to provide optimum performance for each particular appliance by programming the datapath taken by the respective packets or data flows through the architecture. According to one aspect of the invention, the path that a particular packet or data flow takes through the network gateway architecture 10 is programmable, thus providing a flexible architecture that can support either (or both) of packet based and flow based processing. The steering of a packet or flow through a particular path of the DPEs and CPEs is controlled by a tag that is appended to each packet that is forwarded through the gateway. In one embodiment, a CPE or DPE that receives a packet is programmed to process the packet and modify the tag on the packet to direct the packet to another CPE or DPE in the gateway for processing. Alternatively, the tagging function may be centralized in a single entity performing the function of a Frame Steering Processor (FSP). The FSP may prepend all the required tags before any processing begins, allowing a tagged packet to be routed among various CPEs and DPEs, or it may prepend one tag (denoting one CPE or DPE), modify it when it gets the packet back, and repeat until all required processing has been completed.
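The two centralized tagging options described above can be illustrated with a minimal sketch. It is not taken from the specification: the engine names `DPE_A` and `CPE_A`, the `Packet` class, and both function names are hypothetical, and each engine is modeled as a simple function that records its visit.

```python
from dataclasses import dataclass, field

# Hypothetical engines: each one appends a marker to show it handled the packet.
ENGINES = {
    "DPE_A": lambda payload: payload + ["dpe_a"],
    "CPE_A": lambda payload: payload + ["cpe_a"],
}


@dataclass
class Packet:
    payload: list = field(default_factory=list)
    tags: list = field(default_factory=list)  # front of the list is the next hop


def steer_with_full_tag_stack(packet, route):
    """Option 1: the FSP prepends every required tag before processing begins."""
    packet.tags = list(route)
    while packet.tags:
        hop = packet.tags.pop(0)  # consume the outermost tag
        packet.payload = ENGINES[hop](packet.payload)
    return packet


def steer_one_tag_at_a_time(packet, route):
    """Option 2: the FSP prepends one tag, then retags when the packet returns."""
    for hop in route:
        packet.tags = [hop]  # single tag denoting one CPE or DPE
        packet.payload = ENGINES[hop](packet.payload)
        packet.tags = []  # packet is back at the FSP, awaiting its next tag
    return packet
```

Under these assumptions both options visit the same engines in the same order; they differ only in whether the route is carried by the packet or held by the FSP.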

With such an arrangement, the flow of packets through the gateway may be dynamically modified depending upon the operational needs of the appliance associated with the packet being processed. This permits one architecture to easily support a wide range of appliances, each having different computational needs. For example, such an arrangement allows for separate processing of control and data path operations, as well as providing a mechanism for functions that require high computational support, such as encryption functions for example, to be offloaded onto processing devices that can support such functions. Other, less computationally intensive or lower priority functions can be forwarded to PEs having lower operation capacity. With such an arrangement, a dynamic mechanism is provided for customizing the data path of a network device to accommodate the computation needs of any application.

The above network gateway architecture can be implemented in hardware at a variety of granularity levels to improve the performance of existing architectures. For example, referring now to FIGS. 2A and 2B, the concepts of the present architecture may be used to enhance existing high performance switch and appliance architectures. One exemplary switch architecture is shown in FIG. 2A. As mentioned previously, in general the switch architectures are specifically designed to support high speed network switching. Other functionality is generally added to the switch via an appliance, such as appliance 26. The appliance is generally coupled to a host Central Processing Unit (CPU) via a Peripheral Component Interconnect (PCI) connection 13, and uses some of the processing capabilities of the host to perform its function. The communication rate of the PCI interconnect often causes a bottleneck that results in sub-optimal performance of the appliance.

To improve the performance of appliances at the switch, appliance geared architectures such as that in FIG. 2B were developed. The appliance architectures may include a high performance CPU for control plane processing, coupled to a Field Programmable Gate Array that is customized for data plane processing. The performance of the appliances is improved as a result of the dedicated processing power, but the impact of sharing the PCI bus between the control and data processor often lowers the expected performance.

However, referring now to FIG. 3, a high performance switch/appliance architecture can be provided by applying the modular architecture described in FIG. 1 to address the needs of the existing switch and appliance architectures. For example, in FIG. 3, a Frame Steering Processor and High Speed Interconnect (FSP/HSI) pair 29 is disposed between modular DPE and CPE components. The Network Interface Module 23 is used for communicating with the network. The DPE 47 may include, for example, a Network Processor Unit (NPU) and optional silicon for providing high speed support for the data path of the switch application. The FSP/HSI pair 29 is further coupled to CPE 45, where the CPE may be, for example, a CPU and optional silicon for providing support for the control path of the switching application. Also coupled to the switch via the FSP/HSI is an Application Module (AM) 49, which may be either a CPE or DPE, depending upon the specific appliance with which the application module 49 is associated. In one embodiment, multiple application modules (AMs) may be provided, each having differing processing capabilities that may allow the device to be better suited to perform specific types of applications. Optional accelerators, indicated as dashed Hardware (HW) blocks may be used to further enhance the capabilities of any of the CPEs or DPEs.

Thus, the improved modular switch/appliance architecture of FIG. 3 may be used to provide appliance functionality to the switch architecture without affecting the performance of the switch, while also optimizing the performance of the appliances. The architecture is flexible and scalable. High speed interconnects coupled to the Frame Steering Processor may allow additional functionality to be added to the switch by the insertion of blades into the switch. Added appliances may advantageously be programmed to take advantage of the underlying architecture by apportioning their flows into control and datapath specific flows, thereby improving the performance of the appliance.

It should be noted that FIG. 3 illustrates only one embodiment of an integrated switch/appliance architecture, and although certain functionality has been shown as logically separate entities, this is not a limitation of the invention. For example, Frame Steering Processor (FSP) 29 may either be integrated into the existing switch as shown in FIG. 3, or provided on a separate blade coupled to the switch.

Referring now to FIG. 4, an alternative embodiment of the network gateway architecture 10 is shown, where the CPEs and DPEs are selected in a manner that unifies the switch/appliance processing capabilities to reduce duplication of processing components, yet allow for the optimization of appliance performance. In order to ensure that the broadest range of appliances can be supported, consideration needs to be paid to what types of functions are provided at the respective Processing Engines, and the communication bandwidth potential of each link to each PE. Determining which operations should be performed by which types of Processing Engines (PEs) is largely driven by design criteria including power, area, cost, time to market and performance, among other criteria. For example, if time to market is a key design pressure, the use of a custom designed ASIC would be forgone in favor of an off the shelf processor that could perform some functionality, with a lower degree of performance.

Generally speaking, however, certain functions are best performed by a dedicated Datapath PE because of its particular architecture and communication bandwidth. These functions may include, but are not limited to, Network Address Translation, bulk encryption/decryption, compression/decompression, and pattern matching functions. Because one or more DPEs may be provided in the design, each having different interconnect bandwidth capabilities, different functions may be allocated to different DPEs.

Similarly, different Control PEs having different capabilities may also be included in the design. For example, a high performance CPE, architected to support complex and high speed calculations may be used to provide support for load balancing algorithms, SSL protocol handshakes, TCP handshakes and the like. Another CPE of lower performance or reduced bandwidth capabilities may be used to provide host/management functionality such as filter preprocessing, action and alert post-processing.

FIG. 4 is a block diagram of one embodiment of a unified appliance/switch architecture, wherein various DPEs and CPEs having specified functionality have been selected. In FIG. 4, the network gateway is shown to include the FSP/HSI pair 29.

Coupled to the FSP/HSI pair 29 is a set of one or more application modules 52. The application modules can be used to provide enhanced functionality for a given application, and may include optional acceleration silicon, depending upon design needs. Each Application Module (AM) can be either a custom or off the shelf device. Each AM is coupled to the FSP/HSI via a high speed packet interface, such as interface 50, although it is understood that the capabilities of each packet interface may vary.

Referring now to FIG. 5, one embodiment of the unified architecture of FIG. 4 is shown, wherein an Accelerated Flow and Application Processor (AFAP) incorporates the functionality of the Frame Steering Processor, High Speed Interconnect and the DPE 45 (providing data path processing for the switching application). In the embodiment of FIG. 5, a Field Programmable Gate Array (FPGA) is used to provide the High Speed Interconnect functionality. The FPGA may also implement the FSP functionality, or alternatively portions of the FSP functionality may be implemented by the NPU. As shown in FIG. 5, the FPGA is disposed between a Network Interface 40 and a pair of Processing Engines (PEs) including an NPU 36 and a Network CPU 32. In one embodiment the FPGA is designed using the Xilinx 2VP100 FPGA.

Also coupled to the AFAP via a connectorized high speed interconnect is a CPU-based Application Module which may be used for control plane or data plane processing or both. In one embodiment, the High Performance Networking CPU 32 is used for Control Plane processing, and may be, for example, a Motorola MPC8540 by Motorola Corporation, the PMC-Sierra RM9000 (RM9K) by PMC-Sierra, another alternative processor, or a customized ASIC. The Network Processing Unit (NPU) 36 may be dedicated to performing specific network functions and data plane processing at high performance rates. An exemplary device that may be used for data plane processing by NPU 36 is the Intel IXP2800/2850 Castine device, by Intel Corporation, although similar devices may alternatively be used and the present invention is not limited to any particular device.

According to one embodiment of the invention, as packets are received from the Network Interface (NI) 40, they are forwarded to the FPGA 34. The FSP examines certain fields of the packet to identify the type and function of the packet. The exact fields examined may vary depending upon the communication protocol, but in essence the FSP examines the header fields to identify certain aspects such as the service level of the packet, the security parameters, the type (control, data), etc.

In one embodiment, the FSP automatically forwards the packet to the NPU for first processing. In an alternative embodiment, the FSP includes additional filtering functionality that enables it to determine whether the packet is a control type packet or a data type packet. Using this information, the FSP appends a tag to the header, and forwards the packet to output logic in the FSP. The tag effectively acts as a port number, enabling the FSP to switch the packet to the appropriate attached PE.

When the PE receives the packet, it processes it as necessary and modifies the tag to indicate its next destination PE, and forwards the packet back to the FSP. The FSP may either pass the packet through the switching fabric directly, or optionally may be programmed to alter the destination PE indicated by the tag to re-direct the packet. Such a re-direction may be desirable when the originally indicated PE is busy and the operational task associated with the packet can be handled by another PE.
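The optional re-direction step can be sketched as a small lookup. This is an illustration under stated assumptions, not the specified mechanism: the `busy` set stands in for whatever status signal the FSP would receive from the PEs, and the `alternates` mapping and the function name `resolve_destination` are hypothetical.

```python
def resolve_destination(tag, busy, alternates):
    """Return the PE the FSP should forward to for a packet carrying `tag`.

    busy:       set of PE names currently busy (assumed status input).
    alternates: maps a PE name to other PEs able to handle the same task.
    """
    if tag not in busy:
        return tag  # pass the packet straight through the switching fabric
    for candidate in alternates.get(tag, []):
        if candidate not in busy:
            return candidate  # re-direct to a PE that can take the task
    return tag  # nothing else is free, so keep the original destination
```

For example, `resolve_destination("CPE_1", {"CPE_1"}, {"CPE_1": ["CPE_2"]})` selects `CPE_2`, while an idle `CPE_1` is left unchanged.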

Thus, various embodiments of the present invention have been shown. In FIG. 3, a modified high performance switched appliance architecture is enhanced to include frame steering capabilities between the host and custom blades. In FIGS. 4 and 5, a unified switch/appliance architecture has been described that is architected to enable the best Processing Engine to be selected for any given task, whether the task is switch or appliance oriented, or whether it is control or data oriented. In FIG. 5, the FSP functionality and the High-Performance Interconnect are shown combined into an integrated implementation on an FPGA device. The two different embodiments illustrate that the present invention is not limited to the forwarding of a packet, task or flow to any dedicated processing engine. Rather, the optimum engine can be dynamically selected depending upon the needs of the packet, task, or flow, and also on the current operating status of the other PEs in the gateway. Thus, it is envisioned that, if a given CPE is busy, the FSP may be programmed to alter the flow of the packet to any available CPE or DPE.

Accordingly, both the PEs and the FSP may be dynamically programmed to redirect traffic through the network device to accommodate the operating needs of a given appliance. Of course the operating needs of a given appliance may vary. For example, the appliance may be a packet processing application where the same function is performed on every packet of a given flow (the headers of a packet are used to identify a flow) or a flow processing application where a function is applied to the control portion of the flow and a different function may be applied to the data portion of the flow. Whether an appliance is a packet processing application or a flow processing application, the FSP appropriately modifies the tags to direct the respective packets to the appropriate destinations for processing using the information in the header of the packet.

Referring briefly now to FIGS. 6A and 6B, the flow of a packet through the network device will now be described with reference to flow diagram 6A and diagram 6B. Note that FIG. 6A is a functional flow diagram, illustrating steps that are taken at the different devices to complete the transaction.

At step 100 the Network Interface receives a packet, and forwards it to the FSP at step 102. At step 104, the FSP receives the packet, analyzes the packet to determine the destination and forwards it to the NPU at step 106. At step 108, the NPU receives the packet, at step 110 processes it and at step 112 updates the tag and forwards it to the FSP. Note that the NPU has determined that the next operative task to be performed on the packet is to be performed by the Network CPU. Thus, at step 116 the FSP forwards the packet to the Network CPU. The Network CPU receives the packet, processes it, and determines that more work needs to be done on the packet by the NPU. It therefore updates the tag with the NPU identifier, and forwards it to the FSP. The FSP receives the packet, and forwards it through the FSP to the NPU at step 126. The NPU completes the work on the packet, and updates the tag at step 132 to indicate that it should be forwarded to the network interface. The FSP receives the packet and forwards it through to the network interface, which subsequently forwards it to the network at step 140. The path of the packet through the network device is shown in FIG. 6B.
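The sequence above can be traced with a short simulation in which each PE updates the tag and the FSP simply switches on it. This is a sketch only: the dictionary-based packet, the handler functions, and the visit counter used to decide the NPU's next hop are assumptions made for illustration.

```python
NI, NPU, CPU = "NI", "NPU", "CPU"


def npu_handler(packet):
    """NPU work: the first visit hands off to the Network CPU, the second finishes."""
    packet["npu_visits"] += 1
    packet["tag"] = CPU if packet["npu_visits"] == 1 else NI


def cpu_handler(packet):
    """Network CPU work: decides that the NPU must do more on this packet."""
    packet["tag"] = NPU


HANDLERS = {NPU: npu_handler, CPU: cpu_handler}


def run_through_device(packet):
    """Return the ordered list of components the packet visits."""
    path = [NI, "FSP"]    # network interface receives, forwards to the FSP
    packet["tag"] = NPU   # FSP selects the NPU for first processing
    while packet["tag"] != NI:
        destination = packet["tag"]
        path.append(destination)
        HANDLERS[destination](packet)  # PE processes and rewrites the tag
        path.append("FSP")             # packet always returns to the FSP
    path.append(NI)       # final tag names the network interface
    return path


trace = run_through_device({"npu_visits": 0})
print(" -> ".join(trace))
```

The printed trace is `NI -> FSP -> NPU -> FSP -> CPU -> FSP -> NPU -> FSP -> NI`, matching the order of hops described in the text.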

Above it has been described that the tag is a field appended to the packet. In one embodiment, the tag is a multi-bit field, which may be used to either encode an identifier or set a flag indicating the output port that the packet is to use to exit the FSP, where the output port corresponds to the coupled Processing Engine or network interface. Other methods that are known in the art of identifying output ports using tags that are either pre-pended to a packet header or placed in other fields within the packet may be substituted herein, and the present invention is not limited to any particular form of tag.
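The two tag styles mentioned above, an encoded identifier and a per-port flag, can be compared in a brief sketch. The port ordering, the one-byte tag width, and all function names here are assumptions chosen for illustration; the specification does not fix a tag layout.

```python
# Hypothetical port order; index 0 is the network interface.
PORTS = ["NI", "NPU", "CPU", "AM"]


def encode_identifier(port):
    """Identifier style: the tag holds the port's index."""
    return PORTS.index(port)


def encode_flag(port):
    """Flag style: the tag has one bit set, at the port's position."""
    return 1 << PORTS.index(port)


def decode_identifier(tag):
    return PORTS[tag]


def decode_flag(tag):
    # A valid flag tag has exactly one bit set, within the known ports.
    if tag <= 0 or tag & (tag - 1) or tag.bit_length() > len(PORTS):
        raise ValueError(f"not a single-port flag: {tag:#x}")
    return PORTS[tag.bit_length() - 1]


def prepend_tag(tag, frame):
    """Place a one-byte tag in front of the frame bytes."""
    return bytes([tag]) + frame
```

The identifier style needs fewer bits for the same number of ports, while the flag style can be decoded with a simple bit test per port.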

Referring now to FIGS. 7A and 7B, a flow diagram and block diagram, respectively, illustrating the separation of control and data plane processing of a control flow are shown. In FIG. 7A, at step 200 a flow is shown received at the network interface, and forwarded to the FSP at step 204. The FSP determines whether the packets are control plane or data plane specific packets at step 206, and forwards the packets to the appropriate CPE or DPE. Control plane specific packets are shown, in this embodiment, to be forwarded directly to the application specific module for handling via dashed path 82. Data plane specific packets are forwarded on the dotted path 80. Alternatively, the control plane packets could be forwarded to the NPU for processing either prior to forwarding to the application specific AM, or at any time thereafter, and the present invention is not limited to any specific pathways of the control or dataplane packets. Rather, FIGS. 7A and 7B illustrate that packets from the same flow may be forwarded to different processing modules to obtain the optimum performance for both control plane and data plane processing. Following processing by the CPE or DPE, the packets are shown to be returned to the FSP for forwarding out the network interface.
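A minimal sketch of the control/data split follows. The classification rule is an assumption: packets carrying a `syn` or `fin` marker are treated as control plane, loosely reflecting the earlier mention of TCP handshakes as control work. The destination names `AM` and `DPE` and the function names are likewise hypothetical.

```python
# Assumed destinations: control packets go to the application module,
# data packets go to a data path engine.
DESTINATION = {"control": "AM", "data": "DPE"}


def classify(packet):
    """Assumed rule: connection setup or teardown packets are control plane."""
    return "control" if packet.get("syn") or packet.get("fin") else "data"


def split_flow(packets):
    """Group the sequence numbers of one flow by the engine that handles them."""
    paths = {"AM": [], "DPE": []}
    for packet in packets:
        paths[DESTINATION[classify(packet)]].append(packet["seq"])
    return paths


flow = [
    {"seq": 0, "syn": True},
    {"seq": 1},
    {"seq": 2},
    {"seq": 3, "fin": True},
]
result = split_flow(flow)
```

With this example flow, packets 0 and 3 take the control path and packets 1 and 2 take the data path, even though all four belong to the same flow.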

Referring now to FIG. 8, a block diagram illustrating exemplary components that may be included in the FSP is shown. It should be noted that although only two ports are shown here, as described above there may be multiple ports, and the present invention is not limited to any particular arrangement of ports. Each port is shown to include a bi-directional connection, with input from each port being forwarded to an input buffer 50. The input buffer is used to collect the packet until sufficient information is obtained to allow it to proceed to the filter logic 52, or alternatively as a holding device for holding packets while others are being processed by the FSP. The filter 52 parses the header of the packet to determine certain treatment characteristics of the packet that may be used to determine which port (i.e., processing engine) should be used to process the packet. The treatment characteristics may include, but are not limited to, a quality of service at which the packet is to be delivered, a security association that is to be applied to the packet, a type of operation that is to be performed on the packet, a stage of operation of the packet (i.e., whether other stages of processing need to be performed), a type of the packet, whether the packet is a control plane packet or data plane packet, and whether the appliance is a flow based application or packet based application, among other things. The filter strips this information and forwards it to the port selection table 54. The port selection table 54 includes a look-up table which uses the criteria received from the filter to access any pre-programmed port selection information stored in memory, to select a desired output port and, by extension, a processing engine to which to forward the packet.

When the port has been selected, it is forwarded to the tag logic 58, which builds a tag to append to the header of the packet. The tag logic additionally drives a selector 56 which forwards the packet to a correct port output buffer 60 or 62, depending upon the selected tag.
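The FIG. 8 components described above (input buffer, filter, port selection table, tag logic, selector and output buffers) can be modeled as a short pipeline. The sketch below is illustrative only: the header fields `plane` and `qos`, the table contents, the default port, and the class and method names are all assumptions.

```python
from collections import deque

# Assumed pre-programmed table: (plane, qos) -> output port number.
PORT_SELECTION_TABLE = {
    ("control", "high"): 1,
    ("control", "low"): 1,
    ("data", "high"): 0,
    ("data", "low"): 0,
}
DEFAULT_PORT = 0  # assumed fallback when no table entry matches


class FrameSteeringProcessor:
    def __init__(self, num_ports=2):
        self.input_buffer = deque()
        self.output_buffers = [deque() for _ in range(num_ports)]

    def receive(self, packet):
        """Hold the packet until the filter is ready for it."""
        self.input_buffer.append(packet)

    def filter(self, packet):
        """Parse the header into the treatment characteristics used for lookup."""
        header = packet["header"]
        return (header["plane"], header["qos"])

    def step(self):
        """Process one buffered packet and return the selected port."""
        packet = self.input_buffer.popleft()
        criteria = self.filter(packet)
        port = PORT_SELECTION_TABLE.get(criteria, DEFAULT_PORT)
        packet["tag"] = port                      # tag logic builds the tag
        self.output_buffers[port].append(packet)  # selector drives the buffer
        return port
```

A table lookup keeps the per-packet decision cheap, and reprogramming the table changes the steering policy without touching the rest of the pipeline.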

Other features may also be included in the FSP, and it should be noted that FIG. 8 is used to show only one representative implementation of the above described frame steering functionality.

Accordingly, an architecture, method and apparatus have been shown and described that enable appliances, such as Content Delivery Services, to be executed on networking devices without impacting the optimal performance of either the network device or the CDS. The described gateway of FIGS. 1-7 may be implemented in a variety of configurations, including as a single board with the integrated switch function and application processors, as a mother board with switch function and add-on application processor daughter cards, or as a blade chassis with switch and application blades.

Having described various embodiments of the invention, it is understood that the present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the present invention, in addition to those described herein, will be apparent to those of ordinary skill in the art from the foregoing description and accompanying drawings. Further, although the present invention has been described herein in the context of a particular implementation in a particular environment for a particular purpose, those of ordinary skill in the art will recognize that its usefulness is not limited thereto and that the present invention can be beneficially implemented in any number of environments for any number of purposes. For example, though the invention has been described in the context of a networked system on which content delivery services may be layered, it will be apparent to the skilled artisan that the invention is applicable in any system which may need to support appliance applications. The skilled artisan will realize that there are many equivalent ways of implementing the described functionality, and all such modifications are intended to fall within the scope of the following appended claims. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the present invention as disclosed herein. 

What we claim is:
 1. A network device comprising: a plurality of network interfaces for transferring a packet between the network device and a network; at least one processing engine of a first processor type; at least one processing engine of a second processor type; a frame steering processor, coupled between the network interfaces and the processing engines, the frame steering processor being configured: to determine whether the packet is to receive content services treatment; responsive to determining that the packet is not to receive content services treatment, to forward the packet to one of the plurality of network interfaces; and responsive to determining that the packet is to receive content services treatment: to determine a processor type of a processing engine to provide content services treatment to the packet; to select a processing engine of the determined processor type; and to forward the packet to the selected processing engine of the determined processor type.
 2. The network device of claim 1, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on an operation to be performed on the packet by the processing engine.
 3. The network device of claim 2, wherein the frame steering processor is configured to determine the processor type of a processing engine to provide content services treatment to the packet based on an operation to be performed on the packet by the processing engine by determining the processor type of the processing engine to provide content services treatment to the packet based on complexity of the operation to be performed on the packet by the processing engine.
 4. The network device of claim 3, wherein the at least one processing engine of the first processor type is designated for performing operations of greater relative complexity and the at least one processing engine of the second processor type is designated for performing operations of a lesser relative complexity.
 5. The network device of claim 2, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on an operation to be performed on the packet by the processing engine by determining the processor type of the processing engine to provide content services treatment to the packet based on an order of the operation to be performed on the packet by the processing engine.
 6. The network device of claim 2, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on an operation to be performed on the packet by the processing engine by determining the processor type of the processing engine to provide content services treatment to the packet based on an operation type of the operation to be performed on the packet by the processing engine.
 7. The network device of claim 6, wherein the at least one processing engine of the first processor type is designated for performing operations of a first operation type and the at least one processing engine of the second processor type is designated for performing operations of a second operation type.
 8. The network device of claim 7, wherein at least one processor type is designated for performing packet-based applications.
 9. The network device of claim 7, wherein at least one processor type is designated for performing flow-based applications.
 10. The network device of claim 9, wherein at least one processor type is designated for processing control packets of a flow.
 11. The network device of claim 9, wherein at least one processor type is designated for processing data packets of a flow.
 12. The network device of claim 2, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on an operation to be performed on the packet by the processing engine by determining the processor type of the processing engine to provide content services treatment to the packet based on a priority of the operation to be performed on the packet by the processing engine.
 13. The network device of claim 1, wherein the frame steering processor is configured to perform data path functions on the packet.
 14. The network device of claim 1, wherein the frame steering processor is configured to determine whether the packet is to receive content services treatment based on examination of the packet.
 15. The network device of claim 14, wherein the frame steering processor is configured to determine whether the packet is to receive content services treatment based on examination of a tag appended to the packet.
 16. The network device of claim 15, wherein the tag indicates an output port of the frame steering processor.
 17. The network device of claim 15, wherein the at least one processor of at least one of the processor types is configured for modification of the tag.
 18. The network device of claim 17, wherein the at least one processor of the at least one of the processor types is configured to forward a packet with a modified tag back to the frame steering processor.
 19. The network device of claim 18, wherein the at least one processor of at least one of the processor types is configured to update the tag before forwarding the packet back to the frame steering processor.
 20. The network device of claim 15, wherein the frame steering processor is configured for modification of the tag.
 21. The network device of claim 15, wherein the at least one processor of at least one of the processor types is configured for appending a tag to a packet.
 22. The network device of claim 1, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on a quality of service at which the packet is to be delivered.
 23. The network device of claim 1, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on a security association which is associated with the packet.
 24. The network device of claim 1, wherein the frame steering processor is configured to determine the processor type of the processing engine to provide content services treatment to the packet based on whether other stages of processing need to be performed on the packet.
 25. The network device of claim 1, further comprising an external bus coupled to the frame steering processor, the external bus being configured for coupling at least one external processing engine to the network device.
 26. The network device of claim 1, wherein the network device is a network switch. 
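
By way of non-limiting illustration only, the steering decision recited in claim 1 may be sketched in software as follows. The sketch is not part of the claims and is not asserted to be the disclosed implementation. All names in it (Packet, ProcessingEngine, FrameSteeringProcessor, the operation field, the operation-to-type table) are hypothetical, and the round-robin selection among engines of the determined type is an assumption, since the claims do not specify how an engine of that type is selected.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, List, Optional, Tuple


class ProcessorType(Enum):
    FIRST = 1   # assumed: designated for operations of greater complexity
    SECOND = 2  # assumed: designated for operations of lesser complexity


@dataclass
class Packet:
    payload: bytes
    # Hypothetical field: the operation to be performed on the packet, or
    # None when the packet is not to receive content services treatment.
    operation: Optional[str] = None


@dataclass
class ProcessingEngine:
    name: str
    processor_type: ProcessorType
    queue: List[Packet] = field(default_factory=list)


class FrameSteeringProcessor:
    """Illustrative stand-in for the claimed frame steering processor."""

    def __init__(self,
                 engines: Iterable[ProcessingEngine],
                 operation_types: Dict[str, ProcessorType]) -> None:
        engines = list(engines)
        self._operation_types = dict(operation_types)
        # Group the engines by processor type.
        self._pools = {
            t: [e for e in engines if e.processor_type is t]
            for t in ProcessorType
        }
        # Claim 1 recites at least one engine of each processor type.
        for ptype, pool in self._pools.items():
            if not pool:
                raise ValueError(f"no processing engine of type {ptype.name}")
        self._next = {t: 0 for t in ProcessorType}

    def steer(self, packet: Packet) -> Tuple[str, Optional[ProcessingEngine]]:
        # Step 1: determine whether the packet is to receive treatment.
        if packet.operation is None:
            # No treatment: forward toward a network interface.
            return ("interface", None)
        # Step 2: determine the processor type from the operation.
        # An operation missing from the table raises KeyError.
        ptype = self._operation_types[packet.operation]
        # Step 3: select an engine of that type (round robin, an assumption).
        pool = self._pools[ptype]
        engine = pool[self._next[ptype] % len(pool)]
        self._next[ptype] += 1
        # Step 4: forward the packet to the selected engine.
        engine.queue.append(packet)
        return ("engine", engine)
```

In this sketch, a packet whose operation field is unset is returned toward a network interface untouched, while a packet marked for, e.g., encryption is queued on an engine of whichever processor type the table associates with that operation. Mapping operations to types through a table keeps the policy (complexity, order, operation type, or priority, as in claims 3, 5, 6 and 12) separate from the steering mechanism.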